Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-36854] Fix comments and default expressions missing after being transformed #3780

Merged
merged 1 commit into from
Dec 17, 2024

Conversation

yuxiqian
Copy link
Contributor

@yuxiqian yuxiqian commented Dec 5, 2024

This closes FLINK-36854.

Consider the following simple schema:

[id INT NOT NULL COMMENT 'id_field' DEFAULT 17]

This would be sync to downstream sink databases that supports them intactly.

However, things would go wrong after being transformed like this:

transform:
  - projection: id, id AS id_alias, id + 1 AS id_new

Here, id_new column should carry no comments / default expressions obviously, since it has nothing to do with original id technically.

However, for id and id_alias column, it's reasonable to expect that comments and default expressions to be kept. Worse still, the nullability of id_alias will be lost since the previous half is regarded as a complex expression, and nullability of an expression's return type is uncertain.

So, currently the transform will yields the following schema:

[
  id INT NOT NULL,
  id_alias INT,
  id_new INT
]

With this PR, deduced schema will be something like:

[
  id INT NOT NULL COMMENT 'id_field' DEFAULT 17,
  id_alias INT NOT NULL COMMENT 'id_field' DEFAULT 17,
  id_new INT
]

Some extra parts have been changed in TransformParser#generateProjectionColumns since the previous version is a little cryptic.

@yuxiqian
Copy link
Contributor Author

yuxiqian commented Dec 5, 2024

Would @ruanhang1993 and @aiwenmo like to take a look?

@aiwenmo
Copy link
Contributor

aiwenmo commented Dec 5, 2024

Would @ruanhang1993 and @aiwenmo like to take a look?

I will review it.

@aiwenmo
Copy link
Contributor

aiwenmo commented Dec 7, 2024

The supplement for renaming the data type of the original column is great.
For the moment, I can't see where the code needs to be optimized.

Copy link
Contributor

@leonardBang leonardBang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @yuxiqian @aiwenmo for the contribution, +1

@leonardBang leonardBang merged commit 39b560e into apache:master Dec 17, 2024
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants